inference mechanism
DNNs May Determine Major Properties of Their Outputs Early, with Timing Possibly Driven by Bias
Park, Song, Chun, Sanghyuk, Heo, Byeongho, Han, Dongyoon
This paper argues that deep neural networks (DNNs) mostly determine their outputs during the early stages of inference, where biases inherent in the model play a crucial role in shaping this process. We draw a parallel between this phenomenon and human decision-making, which often relies on fast, intuitive heuristics. Using diffusion models (DMs) as a case study, we demonstrate that DNNs often make early-stage decisions that are influenced by the type and extent of bias in their design and training. Our findings offer a new perspective on bias mitigation, efficient inference, and the interpretation of machine learning systems. By identifying the temporal dynamics of decision-making in DNNs, this paper aims to inspire further discussion and research within the machine learning community.
Privacy Inference-Empowered Stealthy Backdoor Attack on Federated Learning under Non-IID Scenarios
Mei, Haochen, Li, Gaolei, Wu, Jun, Zheng, Longfei
Federated learning (FL) naturally faces the problem of data heterogeneity in real-world scenarios, but this is often overlooked by studies on FL security and privacy. On the one hand, the effectiveness of backdoor attacks on FL may drop significantly under non-IID scenarios. On the other hand, malicious clients may steal private data through privacy inference attacks. Therefore, it is necessary to have a comprehensive perspective on data heterogeneity, backdoors, and privacy inference. In this paper, we propose a novel privacy inference-empowered stealthy backdoor attack (PI-SBA) scheme for FL under non-IID scenarios. Firstly, a diverse data reconstruction mechanism based on generative adversarial networks (GANs) is proposed to produce a supplementary dataset, which can improve the attacker's local data distribution and support more sophisticated strategies for backdoor attacks. Based on this, we design a source-specified backdoor learning (SSBL) strategy as a demonstration, allowing the adversary to arbitrarily specify which classes are susceptible to the backdoor trigger. Since the PI-SBA has an independent poisoned data synthesis process, it can be integrated into existing backdoor attacks to improve their effectiveness and stealthiness in non-IID scenarios. Extensive experiments based on the MNIST, CIFAR10, and YouTube Aligned Face datasets demonstrate that the proposed PI-SBA scheme is effective in non-IID FL and stealthy against state-of-the-art defense methods.
- Asia > China > Shanghai > Shanghai (0.05)
- Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)
The logic behind desirable sets of things, and its filter representation
de Cooman, Gert, Van Camp, Arthur, De Bock, Jasper
We identify the logic behind the recent theory of coherent sets of desirable (sets of) things, which generalise desirable (sets of) gambles and coherent choice functions, and show that this identification allows us to establish various representation results for such coherent models in terms of simpler ones.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
Towards Understanding Human Functional Brain Development with Explainable Artificial Intelligence: Challenges and Perspectives
Kiani, Mehrin, Andreu-Perez, Javier, Hagras, Hani, Rigato, Silvia, Filippetti, Maria Laura
The last decades have seen significant advancements in non-invasive neuroimaging technologies that have been increasingly adopted to examine human brain development. However, these improvements have not necessarily been followed by more sophisticated data analysis measures that are able to explain the mechanisms underlying functional brain development. For example, the shift from univariate (single area in the brain) to multivariate (multiple areas in the brain) analysis paradigms is of significance as it allows investigations into the interactions between different brain regions. However, despite the potential of multivariate analysis to shed light on the interactions between developing brain regions, the artificial intelligence (AI) techniques typically applied render the analysis non-explainable. The purpose of this paper is to understand the extent to which current state-of-the-art AI techniques can inform our understanding of functional brain development. In addition, we review which AI techniques are more likely to explain what they learn in terms of the processes of brain development defined by developmental cognitive neuroscience (DCN) frameworks. This work also proposes that eXplainable AI (XAI) may provide viable methods to investigate functional brain development as hypothesised by DCN frameworks.
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Essex (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- (2 more...)
Investigating Efficient Learning and Compositionality in Generative LSTM Networks
Fabi, Sarah, Otte, Sebastian, Wiese, Jonas Gregor, Butz, Martin V.
When comparing human and artificial intelligence, one major difference is apparent: Humans can generalize very broadly from sparse data sets because they are able to recombine and reintegrate data components in compositional manners. To investigate differences in efficient learning, Joshua B. Tenenbaum and colleagues developed the character challenge: First, an algorithm is trained to generate handwritten characters. In the next step, one version of a new type of character is presented. An efficient learning algorithm is expected to be able to re-generate this new character, to identify similar versions of this character, to generate new variants of it, and to create completely new character types. In the past, the character challenge was only met by complex algorithms that were provided with stochastic primitives. Here, we tackle the challenge without providing primitives. We apply a minimal recurrent neural network (RNN) model with one feedforward layer and one LSTM layer and train it to generate sequential handwritten character trajectories from one-hot encoded inputs. To re-generate untrained characters when presented with only one example of them, we introduce a one-shot inference mechanism: the gradient signal is backpropagated to the feedforward layer weights only, leaving the LSTM layer untouched. We show that our model is able to meet the character challenge by recombining previously learned dynamic substructures, which are visible in the hidden LSTM states. Making use of the compositional abilities of RNNs in this way might be an important step towards bridging the gap between human and artificial intelligence.
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- North America > United States > New York > New York County > New York City (0.04)
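The one-shot inference mechanism described in the abstract above (backpropagating the gradient only into the feedforward layer, leaving the LSTM untouched) can be sketched in PyTorch; the layer sizes, optimizer, and trajectory data below are illustrative placeholders, not the paper's exact setup:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sketch: adapt only the feedforward input layer to one example
# of a novel character while the trained LSTM dynamics stay frozen.
n_chars, hidden, out_dim = 10, 32, 2      # one-hot classes, LSTM units, (dx, dy)
embed = nn.Linear(n_chars, hidden)        # feedforward layer: adapted
lstm = nn.LSTM(hidden, hidden)            # recurrent layer: frozen
readout = nn.Linear(hidden, out_dim)      # trajectory readout: frozen

for p in list(lstm.parameters()) + list(readout.parameters()):
    p.requires_grad_(False)               # gradient updates stop here

opt = torch.optim.Adam(embed.parameters(), lr=0.01)
x = torch.zeros(5, 1, n_chars)
x[:, 0, 0] = 1.0                          # one-hot input for the new character
target = torch.randn(5, 1, out_dim)       # its single demonstrated trajectory

lstm_before = [p.clone() for p in lstm.parameters()]
losses = []
for _ in range(100):
    opt.zero_grad()
    h, _ = lstm(embed(x))
    loss = ((readout(h) - target) ** 2).mean()
    loss.backward()                       # gradients reach only `embed`
    opt.step()
    losses.append(loss.item())
```

Note that freezing the LSTM parameters does not block the gradient signal from flowing *through* the recurrent layer back to the feedforward weights, which is exactly what the one-shot mechanism requires.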
Addressing Design Issues in Medical Expert System for Low Back Pain Management: Knowledge Representation, Inference Mechanism, and Conflict Resolution Using Bayesian Network
Santra, Debarpita, Mandal, Jyotsna Kumar, Basu, Swapan Kumar, Goswami, Subrata
Aiming at developing a medical expert system for low back pain management, the paper proposes an efficient knowledge representation scheme using frame data structures, and also derives a reliable resolution logic through a Bayesian Network. When a patient comes to the intended expert system for diagnosis, the proposed inference engine outputs a number of probable diseases in sorted order, with each disease being associated with a numeric measure indicating its possibility of occurrence. When two or more diseases in the list have the same or similar possibility of occurrence, the Bayesian Network is used for conflict resolution. The proposed scheme has been validated with the cases of thirty empirically selected patients. Taking the expected value of 0.75 as the level of acceptance, the proposed system offers diagnostic inference with a standard deviation of 0.029. The computed value of the Chi-squared test is 11.08 with 12 degrees of freedom, implying that the results derived from the designed system are homogeneous with the expected outcomes. Prior to any clinical investigations on the selected low back pain patients, the system achieved an average accuracy of 73.89%, which is quite close to the expected clinical accuracy level of 75%.
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.80)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.80)
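The conflict-resolution flow described in the abstract above can be sketched as follows. All disease names, priors, likelihoods, and the tie threshold are made-up illustrative values, and a naive-Bayes posterior stands in for the paper's actual Bayesian network:

```python
# Illustrative sketch, not the paper's knowledge base: rank candidate
# diagnoses by their initial possibility scores; when the top two are too
# close (a "conflict"), re-rank with a posterior over observed findings.
priors = {"disc_herniation": 0.35, "spinal_stenosis": 0.33, "muscle_strain": 0.32}
likelihood = {  # P(finding | disease), hypothetical numbers
    "leg_pain":    {"disc_herniation": 0.8, "spinal_stenosis": 0.7, "muscle_strain": 0.1},
    "age_over_60": {"disc_herniation": 0.3, "spinal_stenosis": 0.8, "muscle_strain": 0.2},
}

def posterior(findings):
    """Naive-Bayes posterior: P(d | findings) ∝ P(d) * Π P(f | d)."""
    score = dict(priors)
    for f in findings:
        for d in score:
            score[d] *= likelihood[f][d]
    z = sum(score.values())
    return {d: s / z for d, s in score.items()}

def diagnose(findings, tie=0.05):
    ranked = sorted(priors, key=priors.get, reverse=True)
    if priors[ranked[0]] - priors[ranked[1]] < tie:  # conflict: scores too close
        post = posterior(findings)
        ranked = sorted(post, key=post.get, reverse=True)
    return ranked

print(diagnose(["leg_pain", "age_over_60"]))
```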
Knowledge Discovery from Layered Neural Networks based on Non-negative Task Decomposition
Watanabe, Chihiro, Hiramatsu, Kaoru, Kashino, Kunio
Interpretability has become an important issue in the machine learning field, along with the success of layered neural networks in various practical tasks. Since a trained layered neural network consists of complex nonlinear relationships among a large number of parameters, it is difficult to understand how it achieves its input-output mapping on a given data set. In this paper, we propose the non-negative task decomposition method, which applies non-negative matrix factorization to a trained layered neural network. This enables us to decompose the inference mechanism of a trained layered neural network into multiple principal tasks of input-output mapping, and to reveal the roles of hidden units in terms of their contribution to each principal task.
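A minimal sketch of the factorization step, assuming the hidden units' contributions are collected into a non-negative matrix V (units x outputs); the generic Lee-Seung multiplicative updates below stand in for the authors' exact procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf(V, k, iters=500, eps=1e-9):
    """Factor V ≈ W @ H with W, H >= 0 via multiplicative updates."""
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

V = rng.random((20, 8))   # stand-in for unit-to-output contribution scores
W, H = nmf(V, k=3)        # k = number of principal tasks (illustrative)
# W[i] shows how strongly hidden unit i participates in each principal task;
# H[j] shows how principal task j maps onto the outputs.
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The non-negativity constraint is what makes the decomposition readable as a parts-based assignment of hidden units to tasks, rather than the mixed-sign components a plain SVD would produce.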
Abstracting from Observation-Equivalent Entities in Human Behavior Modeling
Schröder, Max (University of Rostock) | Lüdtke, Stefan (University of Rostock) | Bader, Sebastian (University of Rostock) | Krüger, Frank (University of Rostock) | Kirste, Thomas (University of Rostock)
Recognizing human behavior from noisy and ambiguous sensor data is a prerequisite for many applications such as context-aware assistance. The sensor data, however, often do not allow one to distinguish between multiple entities: a presence sensor, for example, cannot distinguish between two persons, i.e., the two are observation-equivalent. Conventional algorithms nevertheless consider each of these entities separately during the inference of human behavior, leading to a high computational burden in scenarios where a large number of entities must be considered. Therefore, these algorithms can only be applied to very limited scenarios. We analyzed the challenges appearing in these scenarios and found that considering observation-equivalent entities separately is one reason for the huge computational effort. Thus, we propose to exploit observation-equivalence by representing entities as groups and inferring over these groups. We sketch a mechanism that exploits observation-equivalences, which we call lifted probabilistic inference. To compare this approach with conventional inference approaches, we adapted an office scenario from the literature so that the number of observation-equivalent entities is parametrized, and simulated a corresponding dataset. This dataset can be used as a benchmark for evaluating different inference approaches with respect to observation-equivalence. We compare the number of states that this approach and a conventional inference algorithm consider during inference on this benchmark. On average, the conventional approach uses almost 200,000 states to cover the situations of the scenario during inference, whereas our lifted probabilistic inference approach uses fewer than 100 states. Thus, an observation-equivalence-based approach seems promising for more efficient inference in scenarios with many observation-equivalent entities.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Europe > Germany (0.04)
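The kind of state-space reduction the abstract above reports can be illustrated with a simple counting argument (our illustration, not the paper's algorithm): with n observation-equivalent entities over k locations, a grounded model tracks every individual assignment, while a lifted model only needs the multiset of how many entities are at each location.

```python
from math import comb

def grounded_states(n, k):
    """Each of n distinguishable entities is in one of k locations."""
    return k ** n

def lifted_states(n, k):
    """Only the per-location counts matter: multisets of size n over k bins."""
    return comb(n + k - 1, k - 1)

# e.g. 8 indistinguishable persons in an office with 6 rooms:
print(grounded_states(8, 6))  # 1679616
print(lifted_states(8, 6))    # 1287
```

The gap grows combinatorially with the number of entities, which matches the abstract's observation that grounded inference becomes infeasible while lifted inference stays small.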
Tractable Epistemic Reasoning with Functional Fluents, Static Causal Laws and Postdiction
We present an epistemic action theory for tractable epistemic reasoning as an extension to the h-approximation (HPX) theory. In contrast to existing tractable approaches, the theory supports functional fluents and postdictive reasoning with static causal laws. We argue that this combination is particularly synergistic because it allows one not only to perform direct postdiction about the conditions of actions, but also indirect postdiction about the conditions of static causal laws. We show that despite the richer expressiveness, the temporal projection problem remains tractable (polynomial), and the planning problem therefore remains in NP. We present the operational semantics of our theory as well as its formulation in Answer Set Programming.
- Europe > Germany > Bremen > Bremen (0.28)
- Asia > Middle East > Republic of Türkiye (0.04)
Representing and Reasoning With Probabilistic Knowledge: A Bayesian Approach
PAGODA (Probabilistic Autonomous Goal-Directed Agent) is a model for autonomous learning in probabilistic domains [desJardins, 1992] that incorporates innovative techniques for using the agent's existing knowledge to guide and constrain the learning process and for representing, reasoning with, and learning probabilistic knowledge. This paper describes the probabilistic representation and inference mechanism used in PAGODA. PAGODA forms theories about the effects of its actions and the world state on the environment over time. These theories are represented as conditional probability distributions. A restriction is imposed on the structure of the theories that allows the inference mechanism to find a unique predicted distribution for any action and world state description. These restricted theories are called uniquely predictive theories. The inference mechanism, Probability Combination using Independence (PCI), uses minimal independence assumptions to combine the probabilities in a theory to make probabilistic predictions.
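The PCI mechanism itself is not detailed in this abstract; as a generic stand-in for the independence-based probability combination it evokes, the classic noisy-OR rule combines the contributions of causes that are assumed to act independently:

```python
# Illustrative stand-in, not PAGODA's actual PCI rule: noisy-OR assumes each
# active cause independently suffices to produce the effect with its own
# probability, so the effect fails only if every cause fails.
def noisy_or(cause_probs):
    """P(effect | active causes) under the independence assumption."""
    p_none = 1.0
    for p in cause_probs:
        p_none *= (1.0 - p)
    return 1.0 - p_none

# e.g. an action contributes 0.7 and a world-state feature contributes 0.5
# (hypothetical numbers): the combined prediction is 1 - 0.3 * 0.5 = 0.85.
p = noisy_or([0.7, 0.5])
```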